Offline recognition of omnifont Arabic text using the HMM ToolKit (HTK)

Author

  • Mohammad S. Khorsheed
Abstract

This paper presents a cursive Arabic text recognition system. The system decomposes the document image into text line images and extracts a set of simple statistical features from a narrow window that slides along each text line. It then feeds the resulting feature vectors to the Hidden Markov Model Toolkit (HTK), a portable toolkit for building speech recognition systems. The proposed system is applied to a data corpus that includes Arabic text from more than 600 A4-size sheets typewritten in multiple computer-generated fonts. © 2007 Elsevier B.V. All rights reserved.
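The sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical features the paper uses, so the ink density per horizontal band computed here is an assumption, as are the function names `sliding_window_features` and `to_htk_bytes`, the window width, the step, the number of bands, and the scan direction (increasing column index). The second function packs the vectors in the HTK parameter file layout (12-byte big-endian header followed by 32-bit floats) with the `USER` parameter kind, which is one plausible way to hand non-speech features to HTK.

```python
import struct


def sliding_window_features(line, win_width=4, step=1, n_cells=3):
    """Return one feature vector per window position.

    line: list of rows, each row a list of 0/1 pixels (1 = ink).
    Each vector holds the ink density of n_cells horizontal bands
    inside the window (an assumed feature, not the paper's exact set).
    """
    height = len(line)
    width = len(line[0])
    if height < n_cells:
        raise ValueError("line image is shorter than the number of cells")
    # Band boundaries split the window height into n_cells parts.
    bounds = [i * height // n_cells for i in range(n_cells + 1)]
    vectors = []
    for x0 in range(0, width - win_width + 1, step):
        vec = []
        for top, bottom in zip(bounds, bounds[1:]):
            ink = sum(
                line[y][x]
                for y in range(top, bottom)
                for x in range(x0, x0 + win_width)
            )
            vec.append(ink / ((bottom - top) * win_width))
        vectors.append(vec)
    return vectors


def to_htk_bytes(vectors, samp_period=100000):
    """Serialise vectors in HTK parameter file layout, kind USER (code 9).

    Header: nSamples (int32), sampPeriod (int32, 100 ns units),
    sampSize (int16, bytes per vector), parmKind (int16).
    samp_period is arbitrary here because images have no time axis.
    """
    n = len(vectors)
    dim = len(vectors[0])
    header = struct.pack(">iihh", n, samp_period, 4 * dim, 9)
    body = b"".join(struct.pack(">%df" % dim, *v) for v in vectors)
    return header + body


if __name__ == "__main__":
    # Toy 6x5 "text line": ink only in the top two rows.
    toy = [[1] * 5, [1] * 5] + [[0] * 5 for _ in range(4)]
    feats = sliding_window_features(toy, win_width=2, step=1, n_cells=3)
    print(len(feats), feats[0])
    print(len(to_htk_bytes(feats)))
```

Each text line then becomes a sequence of fixed-length vectors, which is the same shape of input HTK expects from a speech front end; this is what lets a speech toolkit be reused for text without segmenting the cursive script into characters first.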

Similar resources

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate data entry into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Off-line Arabic Handwritten Isolated Character Recognition using Hidden Markov Models

This paper presents a recognition system for Arabic handwritten isolated characters. The recognition system is based on a hidden Markov model (HMM). The entire system is capable of recognizing Arabic handwritten characters. First, the system removes all the variation in the character images. Second, features are extracted using the sliding window technique with HMM. Then, the HMM is used for ...

Arabic phonemes transcription using data driven approach

The efficiency and correctness of continuous Arabic Speech Recognition Systems (ARS) hinge on the accuracy of the language phoneme set. The main goal of this research is to recognize and transcribe Arabic phonemes using a data-driven approach. We used the Hidden Markov Model Toolkit (HTK) to develop a phoneme recognizer, carrying out several experiments with different parameters, such as varying numb...

Online Handwriting Recognition System for Assamese Language Based on Hmm and Svm Modelling

This work focuses on the development of an Assamese online character recognition system using HMM and SVM and analyses the recognition performance of both models. Recognition models using HTK (HMM Toolkit) and LIBSVM (SVM Toolkit) are generated by training on 181 different Assamese strokes. Stroke and Akshara level testing are performed separately. In stroke level testing, the confusion pat...

Isolated English Language Digit Recognition Using Hidden Markov Model Toolkit

The main purpose of the study was to develop a speech recognition system for isolated digits of the English language using HTK. Speech, in addition to being a tool of communication, is also a symbol of identity and authorization. Two different corpora of audio recordings were collected from English language speakers reading isolated numeric digits. Both of the collected corp...

Journal title:
  • Pattern Recognition Letters

Volume 28

Publication date: 2007